A novel scale based on biomarkers associated with COVID-19 severity can predict the need for hospitalization and intensive care, as well as enhanced probabilities for mortality

Prognostic scales may help to optimize the use of hospital resources, which may be of prime interest in the context of a fast-spreading pandemic. Nonetheless, such tools are underdeveloped in the context of COVID-19. In the present article we asked whether accurate prognostic scales could be developed to optimize the use of hospital resources. We retrospectively studied 467 files of patients hospitalized with COVID-19. The odds ratios for 16 different biomarkers were calculated; those that were significantly associated were screened by a Pearson’s correlation, and the resulting index was used to establish the mathematical function for each marker. The scales to predict the need for hospitalization, intensive-care requirement and mortality had enhanced sensitivities (0.91 CI 0.87–0.94; 0.96 CI 0.94–0.98; 0.96 CI 0.94–0.98; all with p < 0.0001) and specificities (0.74 CI 0.62–0.83; 0.92 CI 0.87–0.96 and 0.91 CI 0.86–0.94; all with p < 0.0001). Interestingly, when a different population was assayed, these parameters did not change considerably. These results show a novel approach to establish the mathematical function of a marker in the development of highly sensitive prognostic tools, which, in this case, may aid in the optimization of hospital resources. An online version of the three algorithms can be found at: http://benepachuca.no-ip.org/covid/index.php

www.nature.com/scientificreports/ (ICR) and hospitalization requirement (HR). Mortality was defined as death occurring during hospitalization; ICR was defined as the use of the ICU staff and facilities for at least one day; and HR was defined as the need for respiratory support above 5 L/min for at least one day (because this requirement is most likely unsustainable in an at-home treatment setting). The control patients for HR, ICR or mortality were those who did not require more than 10 L/min of oxygen supplementation, did not require intensive care, or did not die within the hospital, respectively. The percentage of lung infiltration was evaluated as described elsewhere [17][18][19] using the Chest CT Score. Briefly, chest computed tomography studies were evaluated by two independent researchers who divided the lungs into five anatomical regions (one for each lobe) and assigned each one up to five points depending on the percentage of the parenchyma that was infiltrated, adding the points at the end to a maximum of 25 points.
The level of association of each marker with the three different outcomes was initially assessed by calculating the odds ratio (OR), and the markers were considered to have a significant association with a particular outcome when p ≤ 0.05. For this test the reference values were those published elsewhere 20. We then performed a Kolmogorov-Smirnov test in order to determine the type of distribution of the data (data not shown). Then, the patients' values were transformed in a binary manner, considering their value as "0" when they did not surpass their reference values, and as "1" when they did. The binary data for significantly associated markers were used to perform a Pearson's correlation to estimate their weight or mathematical function, but only those that had a Pearson's correlation index ≥ 0.20 were integrated into the pertinent algorithm. Each algorithm consisted of the addition of the function of each marker (Pearson's index) when the patient exceeded the marker's reference values. Importantly, because our data followed a non-Gaussian distribution, we opted for Spearman's correction for the Pearson's test.
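The binarization and weighting steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`binarize`, `pearson`, `marker_weights`, `op_score`) and the toy data are hypothetical, and the sketch uses the plain Pearson formula without attempting to reproduce the Spearman step.

```python
from math import sqrt


def binarize(value, reference, abnormal_when="above"):
    """Return 1 when the value is abnormal relative to its reference limit, else 0."""
    if abnormal_when == "above":
        return 1 if value > reference else 0
    return 1 if value < reference else 0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / sqrt(var_x * var_y)


def marker_weights(binary_markers, outcome, threshold=0.20):
    """Keep the markers whose correlation with the binary outcome reaches the threshold."""
    weights = {}
    for name, column in binary_markers.items():
        r = pearson(column, outcome)
        if r >= threshold:
            weights[name] = r
    return weights


def op_score(patient_flags, weights):
    """Add the weight of every retained marker that is abnormal for this patient."""
    return sum(w for name, w in weights.items() if patient_flags.get(name, 0) == 1)


# Toy data (hypothetical): eight patients, outcome 1 = positive, 0 = control.
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
crp_values = [150, 130, 200, 125, 90, 60, 80, 100]
markers = {
    "crp": [binarize(v, 120, "above") for v in crp_values],
    "ldh": [1, 1, 1, 0, 1, 0, 0, 0],
    "noise": [1, 1, 0, 0, 1, 1, 0, 0],
}
weights = marker_weights(markers, outcome)
print(weights)
print(op_score({"crp": 1, "ldh": 0}, weights))
```

In this toy set the uncorrelated `noise` marker falls below the 0.20 threshold and is dropped, so it never contributes to a patient's score.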
The individual patient's outcome predictor (OP) values were calculated and plotted into a Receiver Operating Characteristic (ROC) curve to estimate the sensitivity of each algorithm in the prediction of the aforementioned outcomes. Furthermore, the standard deviation of the control group (negative for each outcome) was added to its mean, while the standard deviation of the outcome-positive group was deducted from its mean, and the midpoint between the two results was taken as the cutoff value. Sensitivity (SE), specificity (SP), positive (PPV) and negative predictive (NPV) values, as well as the OR and Chi² values, were then calculated to investigate each algorithm's characteristics. All the statistical tests were performed and graphed using GraphPad Prism X9, and significant differences were considered when p ≤ 0.05.
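The cutoff rule and the diagnostic indices described above can be sketched as follows. This is a hedged illustration under stated assumptions: the function names are hypothetical, the sample standard deviation (n − 1) is assumed because the text does not say which estimator was used, and a score is counted as positive only when it strictly exceeds the cutoff.

```python
from statistics import mean, stdev


def cutoff_value(negative_scores, positive_scores):
    """Midpoint between (control mean + SD) and (outcome-positive mean - SD)."""
    upper_negative = mean(negative_scores) + stdev(negative_scores)
    lower_positive = mean(positive_scores) - stdev(positive_scores)
    return (upper_negative + lower_positive) / 2


def diagnostic_metrics(negative_scores, positive_scores, cutoff):
    """Sensitivity, specificity, PPV and NPV when scores above the cutoff are called positive."""
    tp = sum(1 for s in positive_scores if s > cutoff)
    fn = len(positive_scores) - tp
    fp = sum(1 for s in negative_scores if s > cutoff)
    tn = len(negative_scores) - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
    }


# Toy OP scores (hypothetical values, not study data).
controls = [10, 20, 30]
cases = [80, 100, 120]
cut = cutoff_value(controls, cases)
print(cut)
print(diagnostic_metrics(controls, cases, cut))
```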
A protocol for this study was evaluated by the Institutional Committee of Research Ethics of the Sociedad Española de Beneficencia (Pachuca, Hidalgo) and the study was approved on February 24th of 2020. Our sponsor had no role in study design. All methods were performed in accordance with the relevant guidelines and regulations, including the Declaration of Helsinki, and written informed consent was obtained from all the patients studied.

Results
We assessed a total of 467 clinical files belonging to patients who were hospitalized at Sociedad Española de Beneficencia and Hospital Español from March 12th 2020 to August 1st 2022. All files were reviewed, and 422 were found to be suitable for analysis. 255 files were allocated to algorithm design and 167 were used to validate the algorithms (Supp. Fig. S1). The only criterion for such allocation was to use the files belonging to Hospital Español in the design of the algorithms, while the files from Sociedad Española de Beneficencia were used to validate the tools with a different population. The patients who contributed the data for algorithm design were unvaccinated against SARS-CoV-2, while only 26% of the patients who provided data for the algorithms' validation had already been vaccinated.
Of the 255 clinical files that we used to calculate the algorithm, 175 (74.5%) belonged to patients who were retrospectively found to have a justifiable hospitalization, while 59 (25.5%) did not develop characteristics that made hospitalization mandatory over their whole hospital stay. Moreover, 125 (49.6%) patients required intensive care and 79 (31.1%) died at the hospital.
On the other hand, given that severely affected chest tomography findings (% inf) 21, C-reactive protein (CRP), d-dimer, neutrophils, lymphocytes, lactate dehydrogenase (LDH) 22, procalcitonin, mean arterial pressure (MAP), creatinine, leukocytes, aspartate aminotransferase (AST) 23,24, ferritin, oxygen saturation (sO2) 25, and advanced age and comorbidities 26 have been associated with COVID-19 progression, the exact values of these markers were extracted from the complete clinical files of the participants. Upon gathering the laboratory data for the first 24 h of hospitalization, the OR for each of these markers was calculated in relation to HR (Supp. The markers with significant associations were plotted into a heat map and their Pearson's correlation coefficient was calculated (Fig. 1a,c,e) and used to determine the relative weight, or mathematical function, of each variable in each of the three algorithms. Only four variables had a Pearson's correlation index ≥ 0.20 in relation to each outcome, and thus were considered for the development of the algorithms: Kirby < 200, LDH > 211, CRP > 120 and sO2 < 80 for the prognosis of HR; Rx > 14, Kirby < 200, CRP > 120 and LDH > 400 for the prediction of ICR; and age > 60, Kirby < 150, CRP > 120 and Rx > 15 for mortality (Fig. 1b,d,f).
To calculate the OP score for each patient, the Pearson's index belonging to each variable (Fig. 1b,d,f) was added each time a particular patient presented an abnormal level of a particular marker, and then both control (outcome-negative) and experimental (outcome-positive) patients' values were used to calculate the area under the ROC curve (AUROC).
The COVID-hospitalization outcome prognostic (COVID-HOP) scale had an AUROC of 91% (CI 0.8725-0.9482 at 95%, p < 0.0001), and both the COVID-intensive care outcome prognostic (COVID-ICOP) and the COVID-mortality outcome prognostic (COVID-MOP) scales had an AUROC of 96% (CI 0.9448-0.9855 at 95%, p < 0.0001 and CI 0.9464-0.9872 at 95%, p < 0.0001, respectively) (Fig. 2).
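For readers who want to see what an AUROC measures, it can be computed directly as the probability that a randomly chosen outcome-positive patient scores higher than a randomly chosen control, with ties counted as one half. The sketch below is an illustration of that definition only; the function name and toy scores are hypothetical, and it does not reproduce the confidence intervals or p values reported above.

```python
def auroc(negative_scores, positive_scores):
    """Area under the ROC curve via pairwise comparison of positives against negatives."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0   # positive patient ranked above the control
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Toy OP scores (hypothetical values, not study data).
controls = [10, 20, 60, 30]
cases = [80, 40, 100, 120]
print(auroc(controls, cases))
```

A value of 0.5 corresponds to a scale that cannot separate the two groups, and 1.0 to perfect separation.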
The cutoff value for the COVID-HOP scale was found to be 52.7, while that of the COVID-ICOP was 113.1 and that of the COVID-MOP was 109 (Supp. Fig. S2). Thus, the complete algorithms with cutoff values were designed as detailed in Table 1: the mathematical function of each marker, given by the Pearson's correlation index, is added each time the patient presents levels that exceed the reference values of said marker, and if the sum exceeds the scale's cutoff value, the patient is considered at risk of needing regular hospitalization, needing intensive care, or dying.
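Applying a scale to a single patient can be sketched as follows. The marker thresholds and the COVID-HOP cutoff of 52.7 are taken from the text, but the per-marker weights are PLACEHOLDERS chosen only for illustration, since the actual values are given in Table 1 and Fig. 1 rather than in the text. The names `HOP_RULES`, `scale_score` and `at_risk` are likewise hypothetical.

```python
import operator

# (marker, comparison, reference limit, weight)
# Thresholds come from the text; every weight is a PLACEHOLDER, not a Table 1 value.
HOP_RULES = [
    ("kirby", operator.lt, 200, 30.0),
    ("ldh", operator.gt, 211, 30.0),
    ("crp", operator.gt, 120, 30.0),
    ("so2", operator.lt, 80, 30.0),
]
HOP_CUTOFF = 52.7  # cutoff reported for the COVID-HOP scale


def scale_score(patient, rules):
    """Add the weight of each marker whose value is abnormal for this patient."""
    return sum(
        weight
        for name, is_abnormal, limit, weight in rules
        if is_abnormal(patient[name], limit)
    )


def at_risk(patient, rules, cutoff):
    """A patient is flagged when the summed weights exceed the scale's cutoff."""
    return scale_score(patient, rules) > cutoff


# Hypothetical patient with a low Kirby index and a raised LDH.
patient = {"kirby": 180, "ldh": 300, "crp": 90, "so2": 92}
print(scale_score(patient, HOP_RULES), at_risk(patient, HOP_RULES, HOP_CUTOFF))
```

Keeping the rules in a table separate from the scoring function means the same two functions could serve the COVID-ICOP and COVID-MOP scales by swapping in their own markers, weights and cutoffs.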
Furthermore, the SE, SP, PPV, NPV (Table 2) and OR (Supp. Fig. S3) for each OP scale with the use of the respective cutoff values were calculated, finding that the COVID-HOP had a SE of 86%, SP of 74%, PPV of 90%, NPV of 94% and OR of 18.4 with a CI at 95% of 8.6-36, p ≤ 0.0001. On the other hand, the COVID-ICOP had a SE of 87%, SP of 92%, PPV of 92%, NPV of 88% and OR of 88.5 with a CI at 95% of , p ≤ 0.0001. Finally, the COVID-MOP had a SE of 92%, SP of 91%, PPV of 82%, NPV of 96% and OR of 131 with a CI at 95% of 47-341, p ≤ 0.0001.
Furthermore, 167 patients' records belonging to a different health center, which were not used to calculate the algorithms, were retrospectively studied to validate the SE, SP, PPV and NPV. Only 26% of these patients (32 individuals) were vaccinated against the coronavirus. The results for the MOP algorithm showed no variation in the second population tested, while the ICOP scale exhibited only minimal variation. With regard to the HOP algorithm, the specificity was considerably reduced (0.74 in the creation of the algorithm, 0.36 in the test of accuracy), but the other parameters remained without significant changes (Table 3). Finally, an online version of the algorithms was developed to facilitate their use, and can be found at: http://benepachuca.no-ip.org/covid/index.php
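An odds ratio with its 95% confidence interval can be derived from the 2×2 table of predicted versus observed outcomes. The sketch below uses the common logit (Woolf) interval, which is an assumption: the text does not state which interval method was used. The function name and the toy counts are hypothetical and are not study data.

```python
from math import exp, log, sqrt


def odds_ratio_ci(tp, fp, fn, tn, z=1.96):
    """Odds ratio and logit (Woolf) confidence interval from a 2x2 table.

    tp/fn: outcome-positive patients above/below the cutoff;
    fp/tn: control patients above/below the cutoff.
    """
    if min(tp, fp, fn, tn) == 0:
        raise ValueError("the logit interval needs all four cells to be non-zero")
    odds_ratio = (tp * tn) / (fp * fn)
    se_log_or = sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Toy counts (hypothetical): 30 true positives, 10 false positives,
# 10 false negatives, 30 true negatives.
print(odds_ratio_ci(30, 10, 10, 30))
```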

Discussion
In the present research we assessed the degree of correlation of 16 biomarkers (three of them with two different reference limits) with three different outcomes (the future need for hospitalization and/or intensive care as well as the enhanced probability of mortality) by calculating the odds ratio, revealing six markers associated with the first outcome, 13 with the second, and 15 with the last. Nonetheless, when the data were binary transformed and analyzed by means of a Pearson's correlation, only four markers were found to be associated with each outcome: (i) Kirby < 200, LDH > 211, sO2 < 80 and CRP > 120 were highly associated with the requirement for hospitalization; (ii) Rx > 14, Kirby < 200, CRP > 120 and LDH > 400 were strongly related to the requirement for intensive care; and (iii) age > 60, CRP > 120, Rx > 15 and Kirby < 150 correlated with a high mortality. We then developed three different algorithms, all of them based on adding the Pearson's correlation index for the markers that were relevant to each outcome every time a patient developed pathological levels of a particular marker. Interestingly, when calculating mathematical functions in biomedical sciences, a common approach is to construct a nomogram 27 because of the underlying convenience of this technique. Nonetheless, the precision of such a graphic tool is not remarkable. In these circumstances, the addition of the Pearson's coefficients helped to develop a series of tools with enhanced sensitivity, as the COVID-HOP, COVID-ICOP, and COVID-MOP algorithms showed a sensitivity over 90% in each case.
Table 2. Sensitivity, specificity, positive and negative predictive values for the outcome-prognostic scales. HOP hospitalization outcome-prognostic, ICOP intensive care outcome-prognostic, MOP mortality outcome-prognostic.
Currently many meta-analyses 28-33 studying the risk factors and biomarkers for prediction of COVID-19 outcomes are available, but these are primarily based on cohort studies that are only representative of the Asian population, with minimal involvement of other genetic backgrounds. In this instance, the aforementioned studies' results reflect an enhanced degree of similarity for all the clinical and laboratory findings. Nonetheless, when different populations are studied, the level of association of some biomarkers with the disease outcomes varies 34,35, such that the evaluation of prognostic markers in different populations may be of paramount importance to enhance the sensitivity and specificity of a prognostic tool. In accordance with this line of thought, here we present results derived from the analysis of a Mexican population that reflect key differences in the association of prognostic markers with outcomes of enhanced pathology, among which the absence of a positive correlation between comorbidities and the worsening of COVID-19 stands out. However, in a validation experiment we observed that the SE, SP, PPV, and NPV varied only slightly, despite using data belonging to patients from a different hospital, a quarter of whom had been vaccinated (a condition that was not present in the patients who provided the data for the elaboration of the algorithm). In any case, further research is needed to confirm whether such homogeneity is paralleled in an international cohort. If the present tools do not possess enhanced precision in other settings, the development of specific algorithms for each region may be a viable option.
Finally, the chest CT evaluation is made subjectively according to the physician's appreciation, which could impair the results of the prognostics for enhanced mortality and intensive care requirement, as this marker has an increased weight in these algorithms. Nonetheless, excellent new technologies appear to be emerging in the field, in which such evaluation is made accurately 36,37, and their widespread use may be helpful in the standardization of prognostic criteria.
Overall, these results show the development of three tools that may aid in the administration of hospital resources, including regular hospital beds, intensive care unit beds, and drugs. Such technology may be of enhanced utility in the context of the pandemic waves, which are expected to be a common occurrence in the coming years 38, especially since no vaccine formula has been proven to produce sterilizing immunoglobulin titers 39. In fact, expert committees have agreed that healthcare digital innovations are both lacking 40 and necessary 41 to enhance hospital resiliency, which makes the development of tools of this kind necessary.

Data availability
Data are available upon reasonable request to the corresponding author, Alberto Navarrete Peón, at investigacion@benepachuca.com.